Update summary.py to include parameter combinations #194
For the test cases, we discussed dropping the toy summarization example because it does not have corresponding algorithm-parameter combinations that generated those graphs. We will summarize only the example data and EGFR data that correspond to actual SPRAS runs.
warning: in the working copy of 'test/analysis/input/example/data0-allpairs-params-BEH6YB2_pathway.txt', LF will be replaced by CRLF the next time Git touches it
spras/analysis/summary.py
Outdated
cur_nw_info.append(params)

# Prepare column names
col_names = ["Name", "Number of nodes", "Number of undirected edges", "Number of connected components"]
Should the header be updated to "Number of edges" instead of "Number of undirected edges"?
Even though we’re treating all graphs as undirected (on line 43), directionality doesn’t seem to be considered in any of the statistics being calculated in summary.py at the moment.
I believe the original intention of this header was to convey that even a directed or mixed graph would be parsed and summarized as an undirected graph here. I can see how that could instead be interpreted as only counting the undirected edges in a mixed graph though. Which do you think is more precise?
I was looking at the summary file and got confused about why we were only counting the undirected edges (then I looked through the code and realized we weren't doing that). So I think renaming it to "Number of edges" is more intuitive, along with adding a comment above line 43 saying that "directed or mixed graphs are parsed and summarized as undirected graphs".
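To make the naming question concrete, here is a minimal sketch of what "parsed and summarized as an undirected graph" implies for the edge count. It is not the code in summary.py: the function name `summarize_as_undirected`, the `(source, target, direction)` tuple layout, and the `D`/`U` direction labels are assumptions made for illustration, and it uses only the standard library.

```python
def summarize_as_undirected(edges):
    """Summarize a directed or mixed edge list as if every edge were undirected.

    `edges` is an iterable of (source, target, direction) tuples. The direction
    field is deliberately ignored, which is the behavior under discussion.
    """
    parent = {}  # union-find forest used to count connected components

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    undirected_edges = set()
    for source, target, _direction in edges:
        # A frozenset makes A->B and B->A the same edge.
        undirected_edges.add(frozenset((source, target)))
        parent[find(source)] = find(target)

    components = {find(node) for node in list(parent)}
    return {
        "Number of nodes": len(parent),
        "Number of edges": len(undirected_edges),
        "Number of connected components": len(components),
    }


# Hypothetical mixed graph: two directed edges between A and B, one undirected
# edge, and a separate directed edge.
mixed_edges = [("A", "B", "D"), ("B", "A", "D"), ("B", "C", "U"), ("X", "Y", "D")]
stats = summarize_as_undirected(mixed_edges)
```

In this sketch the two directed edges between A and B collapse into a single edge, so the count covers every edge once direction is discarded rather than only the edges marked undirected. That is the reading "Number of edges" conveys.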
Quote strings in parameter output
Condense config files
Compare summary tables on disk
While doing a final code review, I made some implementation changes. I now quote strings in the parameter combinations in the summary table. However, after reading those tables back in, checking the dataframes for equality failed for some reason. Instead, I compare the two versions of the summary table on disk, which works. This code still assumes that the order of the file paths and the algorithms list is the same, which is a bit dangerous. We could parse the file paths to get the algorithm parameter hash codes instead. I won't do that now. It would work for SPRAS workflows but complicate testing.
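For reference, a rough sketch of the two ideas in this comment: comparing summary tables on disk, and recovering the parameter hash from a pathway file path instead of relying on list order. The helper names and the file-name pattern are assumptions inferred from the single path `data0-allpairs-params-BEH6YB2_pathway.txt` that appears in this PR. They are not existing SPRAS functions, and real file names may not all follow this pattern.

```python
import filecmp
import re
import tempfile
from pathlib import Path


def summary_tables_match(expected_path, actual_path):
    """Compare two summary tables byte for byte rather than as dataframes."""
    # shallow=False forces a content comparison instead of trusting os.stat().
    return filecmp.cmp(expected_path, actual_path, shallow=False)


# Assumed naming scheme: <dataset>-<algorithm>-params-<hash>_pathway.txt
PATHWAY_NAME = re.compile(
    r"^(?P<dataset>.+?)-(?P<algorithm>[^-]+)-params-(?P<hash>[A-Z0-9]+)_pathway\.txt$"
)


def parse_pathway_name(path):
    """Return (dataset, algorithm, parameter hash) parsed from a pathway file path."""
    match = PATHWAY_NAME.match(Path(path).name)
    if match is None:
        raise ValueError(f"Unrecognized pathway file name: {path}")
    return match.group("dataset"), match.group("algorithm"), match.group("hash")


# Small demonstration with throwaway files.
with tempfile.TemporaryDirectory() as tmp_dir:
    expected = Path(tmp_dir, "expected.txt")
    actual = Path(tmp_dir, "actual.txt")
    different = Path(tmp_dir, "different.txt")
    expected.write_bytes(b"Name\tNumber of nodes\nexample\t3\n")
    actual.write_bytes(b"Name\tNumber of nodes\nexample\t3\n")
    different.write_bytes(b"Name\tNumber of nodes\nexample\t4\n")
    same_result = summary_tables_match(expected, actual)
    diff_result = summary_tables_match(expected, different)

parsed = parse_pathway_name(
    "test/analysis/input/example/data0-allpairs-params-BEH6YB2_pathway.txt"
)
```

A byte-for-byte comparison is sensitive to line endings, so the LF to CRLF conversion that Git warns about above could make otherwise identical tables compare as different on some platforms.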
This adds a second copy of the PPI network